Direct numerical simulation of turbulence using GPU accelerated supercomputers

Authors

  • Ali Khajeh-Saeed
  • J. Blair Perot
Abstract

Direct numerical simulations of turbulence are optimized for up to 192 graphics processors. The results from two large GPU clusters are compared to the performance of corresponding CPU clusters. A number of important algorithm changes are necessary to access the full computational power of graphics processors, and these adaptations are discussed. It is shown that the handling of subdomain communication becomes even more critical when using GPU-based supercomputers. The potential for overlap of MPI communication with GPU computation is analyzed and then optimized. Detailed timings reveal that the internal calculations are now so efficient that the operations related to MPI communication are the primary scaling bottleneck at all but the very largest problem sizes that can fit on the hardware. This work gives a glimpse of the CFD performance issues that will dominate many hardware platforms in the near future. © 2012 Elsevier Inc. All rights reserved.

Similar resources

Scalings of Inverse Energy Transfer and Energy Decay in 3-D Decaying Isotropic Turbulence with Non-rotating or Rotating Frame of Reference

Energy development of decaying isotropic turbulence in a 3-D periodic cube with non-rotating or rotating frames of reference is studied through direct numerical simulation using a GPU-accelerated lattice Boltzmann method. The initial turbulence is isotropic, generated in spectral space with prescribed energy spectrum E(κ) ~ κ^m in a range between κ_min and ...

GPU accelerated lattice Boltzmann simulation for rotational turbulence

In this work, we numerically study decaying isotropic turbulence in periodic cubes with frame rotation using the lattice Boltzmann method (LBM) and present the results of rotation effects on turbulence. The LBM is implemented on a GPU (Graphics Processing Unit) platform using CUDA (Compute Unified Device Architecture). Through the accelerated GPU-LBM simulation, we look into various effect...

A CUDA Implementation of the High Performance Conjugate Gradient Benchmark

The High Performance Conjugate Gradient (HPCG) benchmark has been recently proposed as a complement to the High Performance Linpack (HPL) benchmark currently used to rank supercomputers in the Top500 list. This new benchmark solves a large sparse linear system using a multigrid preconditioned conjugate gradient (PCG) algorithm. The PCG algorithm contains the computational and communication patt...

High-performance Computation and Visualization of Plasma Turbulence on Graphics Processors

Direct numerical simulation (DNS) of turbulence is computationally very intensive and typically relies on some form of parallel processing. Spectral kernels used for spatial discretization are a common computational bottleneck on distributed memory architectures. One way to increase the efficiency of DNS algorithms is to parallelize spectral kernels using tightly coupled Single-Program-Multiple-...

Numerical Simulation of Flash Boiling Effect in a 3-Dimensional Chamber Using CFD Techniques

Flash boiling atomization is one of the most effective means of generating a fine and narrowly dispersed spray. Because of its complexity, its potential has not been fully realized. In this paper, a three-dimensional chamber has been modeled with a straight fuel injector. The effect of flash boiling has been investigated by computational fluid dynamics (CFD) techniques. A finite volume approach with the ...

Journal title:
  • J. Comput. Physics

Volume 235

Publication date 2013